Abstract. We study the complexity of the popular one player combinatorial game
known as Flood-It. In this game the player is given an $n \times n$ board of tiles where
each tile is allocated one of $c$ colours. The goal is to make the colours of all tiles
equal via the shortest possible sequence of flooding operations. In the standard
version, a flooding operation consists of the player choosing a colour $k$, which
then changes the colour of all the tiles in the monochromatic region connected
to the top left tile to $k$. After this operation has been performed, neighbouring
regions which are already of the chosen colour $k$ will then also become connected,
thereby extending the monochromatic region of the board. We show that finding
the minimum number of flooding operations is NP-hard for $c \ge 3$ and that this
even holds when the player can perform flooding operations from any position on
the board. However, we show that this 'free' variant is in P for $c = 2$. We also prove
that for an unbounded number of colours, Flood-It remains NP-hard for boards
of height at least 3, but is in P for boards of height 2. Next we show how a $(c - 1)$
approximation and a randomised $2c/3$ approximation algorithm can be derived,
and that no polynomial time constant factor, independent of $c$, approximation
algorithm exists unless P=NP. We then investigate how many moves are required
for the 'most demanding' $n \times n$ boards (those requiring the most moves) and show
that the number grows as fast as $\Theta(\sqrt{c}\, n)$. Finally, we consider boards where the
colours of the tiles are chosen at random and show that for $c \ge 2$, the number of
moves required to flood the whole board is $\Omega(n)$ with high probability.

1 Introduction

In the popular one player combinatorial game known as Flood-It, each tile of an
$n \times n$ board is allocated one of $c$ colours, where $c$ is a parameter of the game.
Two left/right/up/down adjacent tiles are said to be connected if they have the
same colour and a (connected) region of the board is defined to be any maximal
connected component. The standard version of the game starts with the player 
'flooding' the region that contains the top left tile. The flooding operation simply 
involves changing the colour of all the tiles in the region to be some new colour. 
However, this also has the effect of connecting the newly flooded region to all 
neighbouring regions of this colour. The overall aim is to flood the entire board, 
that is connect all regions, in as few flooding operations as possible. Every flooding 
operation changes the colour of the region that contains the top left tile. Figure 1 
gives an example of the first few moves of a game. The border shows the outline 
of the region which has so far been flooded. 
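
The flooding operation described above can be sketched in a few lines of Python. This is a minimal illustration and not code from the paper: the board representation (a list of rows of integer colours) and the names `flooded_region`, `flood` and `is_flooded` are assumptions made for the example.

```python
from collections import deque

def flooded_region(board):
    """Return the set of (row, col) tiles in the region containing the top left tile."""
    rows, cols = len(board), len(board[0])
    colour = board[0][0]
    seen = {(0, 0)}
    queue = deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        # Only left/right/up/down neighbours count as adjacent.
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and board[nr][nc] == colour):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def flood(board, k):
    """One standard move: recolour the top left region to colour k (returns a new board)."""
    new_board = [row[:] for row in board]
    for r, c in flooded_region(board):
        new_board[r][c] = k
    return new_board

def is_flooded(board):
    """True when the top left region covers the whole board."""
    return len(flooded_region(board)) == len(board) * len(board[0])
```

Neighbouring regions of colour k need no special handling: after recolouring, they are simply part of the region found by the next call to `flooded_region`.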

In this paper, we investigate a number of questions inspired by Flood-It. We
first show that not only are natural greedy approaches to the game bad, but in
fact finding an optimal solution (one which requires the fewest possible moves)
for Flood-It is NP-hard for $c \ge 3$, and that this also holds for a variant of the
game we call Free-Flood-It where the player can perform flooding operations at
any position on the board. On the other hand, we show that solving Free-Flood-It
with $c = 2$ is in P. We also consider the effect of changing the shape of the board,
and prove that Flood-It remains NP-hard for rectangular boards of height at
least 3, with an unbounded number of colours, but is in P for boards of height
2. As a stepping stone, we also prove NP-hardness of a restricted version of the
well-studied shortest common supersequence problem (q.v.).

Fig. 1: A sequence of four moves on a $6 \times 6$ Flood-It board with 3 colours.

Next we show how a $(c - 1)$ approximation and a randomised $2c/3$ approximation
algorithm for Flood-It can be derived. However, no polynomial time constant
factor, independent of $c$, approximation algorithm exists unless P=NP. We then
consider how many moves are required for the most demanding boards and show
that the number grows as fast as $\Theta(\sqrt{c}\, n)$. We say that a board is one of the
most demanding boards if it requires at least as many moves as any other board
which has the same size and number of colours. Finally, we investigate boards
where the colours of the tiles are chosen at random and give a simple proof that
for $c \ge 3$, the number of moves required to flood the whole board is $\Omega(n)$ with
high probability. We then observe that the same result can in fact be proven for
$c \ge 2$ by appealing to previous deep results in percolation theory [3,7]; indeed,
our work can be seen as a drastic simplification of these results for the case $c \ge 3$.

History and related work: Perhaps the most famous recent hardness result involving
a popular game is the NP-completeness of Tetris [4]. Flood-It seems to be
a somewhat newer game than Tetris, first making its appearance online in early
2006 courtesy of a company called Lab Pixies. Since then numerous versions have
become available for almost every conceivable platform. We have very recently become
aware of a sketch proof by Elad Verbin posted on a blog of the NP-hardness
of Flood-It with 6 colours [18]. Although our work was completed independently,
it is interesting to note that there is some similarity to the techniques used in our
NP-hardness proof for $c \ge 3$ colours.

Independently of this work, Fleischer and Woeginger have studied a closely 
related game to Flood-It, known as Honey-Bee [5]. This game is also based around 
repeatedly applying a flood filling operation on a grid. The main differences are 
that the grid is hexagonal and may contain barriers, and also that there is a 
two-player variant of the game. In this variant, two players start flood filling from 
opposite corners, and the goal is to control more of the board than your opponent. 
Fleischer and Woeginger focus on the computational complexity of Honey-Bee, 
and consider a number of generalisations of the single player game to different 
classes of graphs. They prove that some generalisations are NP-hard, while others 
are in P. Again, there is some similarity in the techniques used in one of their 
NP-hardness proofs, although we note that this proof does not immediately apply 
to Flood-It without some modification. Fleischer and Woeginger also show that
the two-player game on arbitrary graphs is PSPACE-complete.
Fig. 2: (a) An alternating 4-diamond and (b) a cropped 6-diamond.

Another related game whose computational complexity has been studied in 
detail is known as Clickomania [2]. A rectangular board is initialised in the same 
way as in Flood-It. The move permitted is for the player to remove a chosen 
connected monochromatic component of at least two tiles after which any blocks 
above it will fall down as far as they can. Finding an optimal solution to Clicko- 
mania has been shown to be NP-hard for two or more columns and five or more 
colours, or five or more columns and three or more colours. 

There is also existing work on a majority-based recolouring game on graphs 
[1,6,15]. The game is played over a number of rounds on a simple undirected graph 
where each vertex is initially coloured white or black. In each round each vertex 
is recoloured by the colour of the majority of its neighbours. The player's only 
interaction is to determine the set of vertices which are initially coloured white. 
The goal is to pick the smallest possible set of vertices such that after a finite 
number of rounds, all vertices are white. 

Flood-It can be thought of as a model for a number of different (possibly 
not entirely) real world applications. For example, our results supplement
recent work on zombie infestation [14] if one regards the flooding operation as
one where the minds of neighbouring non-zombies are infected by those who have 
already been turned into zombies. A separate but no less significant line of research 
considers the complexity of tools commonly provided with Microsoft Windows. 
Previous work has shown that aspects of Excel [9] and even Minesweeper [11] are 
NP-complete. Our work extends this line of research by showing that flood filling 
in Microsoft's Paint application is also NP-hard. 

1.1 Notation and definitions 

Let $\mathcal{B}_{n,c}$ be the set of all $n \times n$ boards with at most $c$ colours. We write $m(B)$
for the minimum number of moves required to flood a board $B \in \mathcal{B}_{n,c}$. We will
refer to rows and columns in a board in the usual manner. We further denote
the colour of the tile in row $i$ and column $j$ as $B[i,j]$; colours are represented by
integers between 1 and $c$. Throughout we assume that $2 \le c \le n^2$.

We define a diamond to be a diamond-shaped subset of the board (see Figure 2a).
These structures are used throughout the paper. The centre of the diamond
is a single tile and the radius is the number of tiles from its centre to its
leftmost tile. We write $r$-diamond to denote a diamond of radius $r$. A single tile
is therefore a 1-diamond. For $i \in \{1, \dots, r\}$, the $i$th layer of an $r$-diamond is the
set of tiles at board distance $i - 1$ from its centre. We will also consider diamonds
which are cropped by intersection with the board edges as in Figure 2b.
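
The diamond definition can be made concrete with a short Python sketch. It assumes that 'board distance' means Manhattan distance between (row, column) positions, which is consistent with the diamond shape but is not spelled out in the text; the function names and the 0-indexed coordinates are illustrative choices.

```python
def diamond_layer(centre, tile):
    """1-based layer index of `tile` in a diamond centred at `centre`.

    Assumption: 'board distance' is Manhattan distance.
    """
    return abs(tile[0] - centre[0]) + abs(tile[1] - centre[1]) + 1

def diamond_tiles(centre, radius, n=None):
    """Tiles of an r-diamond grouped by layer; cropped to an n x n board if n is given."""
    layers = {i: [] for i in range(1, radius + 1)}
    cr, cc = centre
    for r in range(cr - radius + 1, cr + radius):
        for c in range(cc - radius + 1, cc + radius):
            layer = diamond_layer(centre, (r, c))
            if layer > radius:
                continue  # outside the diamond
            if n is not None and not (0 <= r < n and 0 <= c < n):
                continue  # cropped by the board edge
            layers[layer].append((r, c))
    return layers
```

Under this reading, an uncropped 3-diamond has layers of 1, 4 and 8 tiles, and a 1-diamond is a single tile.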



Fig. 3: A $10 \times 10$ board where a greedy approach is bad.



2 A greedy approach is bad 

An obvious strategy for playing the Flood-It game is the greedy approach. There 
are two natural greedy algorithms: (1) we pick the colour that results in the 
largest gain (number of acquired tiles), or (2) we choose the colour dominating the 
perimeter of the currently flooded region. It turns out that both these approaches 
can be surprisingly bad. 

To see this, let B be the 10 x 10 board on three colours illustrated in Figure 3. 
The number of moves required to flood B is three. However, either greedy ap- 
proach given would first pick the colours appearing on the horizontal lines before 
finally choosing to flood the left-hand vertical column. In both cases, this requires 
10 moves to fill the board. It should be clear how this example can easily be 
extended to arbitrarily large $n \times n$ boards. In general, the greedy algorithm will
make n moves, while the optimal algorithm will still make only three. 
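
The following Python sketch illustrates greedy strategy (1) and compares it with a brute-force optimum. The board built by `spine_board` is a hypothetical construction in the spirit of Figure 3, not a reproduction of it: here the optimum is four moves rather than three, while greedy still needs n moves. All function names are assumptions made for the example, and the breadth-first search is only feasible for tiny boards.

```python
def region(board):
    """Tiles connected to the top left tile through same-coloured neighbours."""
    rows, cols = len(board), len(board[0])
    colour, seen, stack = board[0][0], {(0, 0)}, [(0, 0)]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and board[nr][nc] == colour):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return seen

def flood(board, k):
    """One standard move: recolour the top left region to k; returns a new board."""
    tiles = region(board)
    return tuple(tuple(k if (r, c) in tiles else v for c, v in enumerate(row))
                 for r, row in enumerate(board))

def greedy_moves(board, colours):
    """Greedy strategy (1): always pick the colour that acquires the most tiles."""
    moves, total = [], len(board) * len(board[0])
    while len(region(board)) < total:
        best = max(colours, key=lambda k: len(region(flood(board, k))))
        board = flood(board, best)
        moves.append(best)
    return moves

def optimal_moves(board, colours):
    """Minimum number of moves by breadth-first search over boards (tiny boards only)."""
    total = len(board) * len(board[0])
    board = tuple(map(tuple, board))
    frontier, seen, depth = [board], {board}, 0
    while True:
        if any(len(region(b)) == total for b in frontier):
            return depth
        nxt = []
        for b in frontier:
            for k in colours:
                nb = flood(b, k)
                if nb not in seen:
                    seen.add(nb)
                    nxt.append(nb)
        frontier, depth = nxt, depth + 1

def spine_board(n):
    """Hypothetical n x n board (n >= 4): rows 0 and 1 are solid, column 0 below them
    is a colour-3 'spine', and the remaining rows alternate between colours 1 and 2."""
    board = [[1] * n, [2] * n]
    for r in range(2, n):
        board.append([3] + [1 if r % 2 == 0 else 2] * (n - 1))
    return tuple(map(tuple, board))
```

On `spine_board(6)` the greedy strategy always prefers a row of five tiles over the four-tile spine and only floods the spine last, whereas flooding the spine early connects all remaining rows at once.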

3 The complexity of Flood-It 

Let c-Flood-It denote the problem which takes as input an $n \times n$ board $B$ of $c$
colours and outputs the minimum number of moves $m(B)$ in a Flood-It game that
are required to flood B. Similarly, let c-Free-Flood-It denote the generalised 
version of c-Flood-It in which we are free to flood fill from an arbitrary tile 
in each move. Although we have seen that a straightforward greedy algorithm 
fails, it is not too far-fetched to think that a dynamic programming approach 
would solve these problems efficiently, but the longer one ponders over it, the 
more inconceivable it seems. To aid frustrated Flood-It enthusiasts, we prove in 
this section that both c-Flood-It and c-Free-Flood-It are indeed NP-hard, 
even when the number of colours is as small as three. Interestingly, we will see 
that 2-Free-Flood-It is in P. 

To show NP-hardness, we reduce from the shortest common supersequence
problem, denoted SCS, which is defined as follows. The input is a set $S$ of $k$
strings over an alphabet $\Sigma$. A common supersequence $s$ of the strings in $S$ is a
string such that every string in $S$ is a subsequence of $s$. The output is the length
of a shortest common supersequence of the strings in $S$. The decision version
of SCS takes an additional integer $\ell$ and outputs yes if the shortest common
supersequence has length at most $\ell$, otherwise it outputs no.
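
As a concrete reading of these definitions, the Python sketch below checks the subsequence property and finds the length of a shortest common supersequence by exhaustive search. The function names are illustrative, and the search takes exponential time, so it is only meant for very small instances.

```python
from itertools import product

def is_subsequence(t, s):
    """True if t can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in t)

def is_common_supersequence(s, strings):
    """True if every string in `strings` is a subsequence of s."""
    return all(is_subsequence(t, s) for t in strings)

def shortest_common_supersequence_length(strings):
    """Brute force over candidate strings of increasing length (tiny inputs only)."""
    alphabet = sorted(set("".join(strings)))
    upper = sum(len(t) for t in strings)  # concatenating all strings always works
    for length in range(upper + 1):
        for cand in product(alphabet, repeat=length):
            if is_common_supersequence(cand, strings):
                return length

def scs_decision(strings, ell):
    """Decision version: is there a common supersequence of length at most ell?"""
    return shortest_common_supersequence_length(strings) <= ell
```

For instance, the strings "23", "12", "32" and "21" have "2132" as a common supersequence, and no string of length three is a common supersequence of all four.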

Maier [13] showed in 1978 that the decision version of SCS is NP-complete
if the alphabet size $|\Sigma| \ge 5$. A couple of years later, Räihä and Ukkonen [16]
extended this result to hold for $|\Sigma| \ge 2$. For a long time, various groups of people
tried to approximate SCS but no polynomial-time algorithm with guaranteed
approximation bound was to be found. It was not until 1995 that Jiang and
Li [10] settled this open problem by proving that no polynomial-time algorithm
can achieve a constant approximation ratio for SCS, unless P = NP. Their result
holds for an unbounded alphabet.

The following lemma proves the NP-hardness of both c-Flood-It and c- 
Free-Flood-It when the number of colours is at least four. The inapproxima- 
bility of both problems follows immediately from the approximation preserving 
nature of the reduction. However, in the reduction we present, the number of 
colours in the c-Flood-It instance will be exactly twice the number of alphabet 
symbols in the SCS instance. For this reason, our inapproximability results only 
hold when the number of colours is unbounded. We will need a more specialised 
reduction for the case c = 3, which is given in Lemma 2. 

Lemma 1. For $c \ge 4$, c-Flood-It and c-Free-Flood-It are NP-hard (and
the decision versions are NP-complete). Further, for an unbounded number of
colours $c$, there is no polynomial-time constant factor approximation algorithm,
unless P = NP.

Proof. The proof is split into two parts; first we prove the lemma for c-Flood-It 
in which we flood fill from the top left tile in each move, and in the second part 
we generalise the proof to c-Free-Flood-It in which we can flood fill from any 
tile in each move. 

We reduce from an instance of SCS that contains $k$ strings $s_1, \dots, s_k$ each of
length at most $w$ over the alphabet $\Sigma$. Suppose that $\Sigma = \{a_1, \dots, a_r\}$ contains
$r \ge 2$ letters and let $\Sigma' = \{b_1, \dots, b_r\}$ be an alphabet with $r$ new letters. For
$i \in \{1, \dots, k\}$, let $s_i'$ be the string obtained from $s_i$ by inserting the character $b_j$
after each $a_j$ and inserting the character $b_1$ at the very front. For example, from
the string $a_3 a_1 a_4 a_3$ we get $b_1 a_3 b_3 a_1 b_1 a_4 b_4 a_3 b_3$.
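
The string transformation can be written out as a small Python sketch. Representing a string over the alphabet by its list of letter indices, and naming the letters "a1", "b1" and so on, are conventions chosen for this example only.

```python
def pad_string(indices):
    """Map s = a_{i1} a_{i2} ... (given as a list of indices) to the padded string s'.

    A b_1 is placed at the very front and b_j is inserted after every a_j.
    """
    out = ["b1"]
    for j in indices:
        out += [f"a{j}", f"b{j}"]
    return out

def no_equal_neighbours(symbols):
    """True if no two consecutive symbols coincide."""
    return all(x != y for x, y in zip(symbols, symbols[1:]))
```

A string of length w is mapped to one of length 2w + 1, and since letters of the two alphabets alternate, no two consecutive characters of the padded string are equal.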

Let $\Sigma \cup \Sigma'$ represent the set of $2r$ colours that we will use to construct a board
$B$. First, for $i \in \{1, \dots, k\}$, we define the $|s_i'|$-diamond $D_i$ such that the $j$th layer
will contain only one colour which will be the $j$th character from the right-hand
end of $s_i'$. Thus, the colour of the outermost layer of $D_i$ is the first character
of $s_i'$ (which is $b_1$ for all strings) and the centre of $D_i$ is the last character of
$s_i'$. The reason why we intersperse the strings with letters from the auxiliary
alphabet $\Sigma'$ is to ensure that no two adjacent layers of a diamond have the same
colour. This property is crucial in our proof. Let $B$ be a sufficiently large $n \times n$
board constructed by first colouring the whole board with the colour $b_1$ and then
placing the $k$ diamonds $D_i$ on $B$ such that no two diamonds overlap. Since each
of the $k$ diamonds has a radius of at most $2w + 1$, we can be assured that $n$ never
has to be greater than $k(4w + 1)$.

Suppose that $s$ is a shortest common supersequence of $s_1, \dots, s_k$ and suppose
its length is $\ell$. We will now argue that the minimum number of moves to flood $B$
is exactly $2\ell$, first showing that $2\ell$ moves are sufficient. Let $s'$ be the $2\ell$-long string
obtained from $s$ by inserting the character $b_j$ after each $a_j$. We make $2\ell$ moves by
choosing the colours in the same order as they appear in $s'$. Note that we flood
fill from the top left tile in each move. From the construction of the diamonds $D_i$
it follows that all diamonds, and hence the whole board, are flooded after the last
character of $s'$ has been processed.

It remains to be shown that at least $2\ell$ moves are necessary to flood $B$. Let $s''$
be a string over the alphabet $\Sigma \cup \Sigma'$ that specifies a shortest sequence of moves
that would flood the whole board $B$. From the construction of the diamonds $D_i$
it follows that the string obtained from $s''$ by removing every character in $\Sigma'$
is a common supersequence of $s_1, \dots, s_k$ and therefore has length at least $\ell$. By
symmetry (replace every $a_j$ with $b_j$ in the strings $s_1, \dots, s_k$), the string obtained
from $s''$ by removing every character in $\Sigma$ has length at least $\ell$ as well. Thus, the
length of $s''$ is at least $2\ell$.

Since the decision version of SCS is NP-complete even for a binary alphabet
$\Sigma$, it follows that c-Flood-It is NP-hard for $c \ge 4$, and the decision version
is NP-complete. As discussed above, observe that the number of colours used is
exactly twice the alphabet size of the SCS instance. Therefore the inapproximability
result for an unbounded number of colours in the statement of the lemma
follows immediately from the approximation preserving nature of the reduction
given.

Now we show how to extend these results to c-Free-Flood-It. The reduction
from SCS is similar to the previously presented reduction. However, instead of
constructing only one board $B$, we construct $2kw + 1$ copies of $B$ and put them
together to one large $n' \times n'$ board $B'$. If necessary in order to make $B'$ a square,
we add sufficiently many $n \times n$ boards that are filled only with the colour $b_1$. Note
that $(2kw + 1)n$, and hence $(2kw + 1)k(4w + 1)$, is a generous upper bound on $n'$.

From the construction of $B'$ it follows that exactly $2\ell$ moves are required to
flood $B'$ if we flood fill from the top left tile in each move; all copies of $B$ will be
flooded simultaneously. The question is whether we can do better by flood filling
from tiles other than the top left one (or any tile in its connected component).
That is, can we do better by picking a tile inside one of the diamonds? We will
argue that the answer is no. First note that $2\ell \le 2kw$. Suppose that we do flood fill
from a tile inside some diamond $D$ for some move. This move will clearly not affect
any of the other diamonds on $B'$. Suppose that this move would miraculously flood
the whole of $D$ in one go so that we can disregard it in the subsequent moves.
However, there were originally $2kw + 1$ copies of $D$, which is one more than the
absolute maximum number of moves required to flood $B'$, hence we can use a
recursive argument to conclude that flood filling from a tile inside a diamond will
do us no good and would only result in more moves than if we choose to flood fill
from the top left tile in each move. □

The reduction in the previous proof is approximation preserving, which allowed
us to prove that there is no efficient constant factor approximation algorithm.
We reduced from an instance of SCS by doubling the alphabet size,
resulting in instances of c-Flood-It and c-Free-Flood-It with $c \ge 4$ colours.
To establish NP-hardness for $c = 3$ colours, we need to consider a different
reduction. We do this in the lemma below by reducing from the decision version of
SCS over a binary alphabet to the decision versions of 3-Flood-It and
3-Free-Flood-It. This reduction is not approximation preserving as in the previous
proof; the number of moves required to flood the board in the reduced instance
of 3-Flood-It (or 3-Free-Flood-It) does not correspond in a straightforward
way to the length of a shortest common supersequence in the SCS instance we
reduce from.

Fig. 4: An example of (a) a diamond and (b) a rectangle constructed in the proof
of Lemma 2.

Lemma 2. 3-Flood-It and 3-Free-Flood-It are NP-hard (and the decision
versions are NP-complete).

Proof. We reduce from an instance of the decision version of SCS on $k$ strings
$s_1, \dots, s_k$ of length at most $w$ over the binary alphabet $\{1,2\}$ and an integer $\ell$.
The yes/no question is whether there exists a common supersequence of length
at most $\ell$.

For $i \in \{1, \dots, k\}$, let $s_i'$ be the string obtained from $s_i$ by inserting the new
character 3 at the front of $s_i$ and after each character of $s_i$. Let the set $\{1,2,3\}$
represent the colours that we will use to construct a board $B$. First, for each of
the $k$ strings $s_i'$ we define the diamond $D_i$ exactly as in the proof of Lemma 1
(see Figure 4a). We define $R$ to be the following rectangular area of the board
of width $4\ell + 5$ and height $2\ell + 3$. Let $x$ be the middle tile at the bottom of $R$.
Around $x$ we have layers of concentric half rectangles (see Figure 4b). We refer
to these layers as arches, with the first arch being $x$ itself. As demonstrated in
the figure, the first arch has the colour 1 and the second arch has the colour 2.
All the remaining odd arches have the colour 3, and all the remaining even arches
are coloured 2 everywhere except for the tile above $x$ which has the colour 1. As
described in detail below, the purpose of these arches is to control which minimal
sequences of moves would flood $B$.
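
The rectangle of arches can be sketched in Python as follows. Since Figure 4b is not reproduced here, the geometry is an interpretation of the textual description: arch number a is taken to be the set of tiles whose larger offset from x (horizontal, or vertical upwards) equals a - 1, which matches the stated width and height exactly. The function name and the 0-indexed layout with the bottom row last are assumptions.

```python
def build_arch_rectangle(ell):
    """Return the rectangle R as a list of rows (top row first) of colours in {1, 2, 3}."""
    width, height = 4 * ell + 5, 2 * ell + 3
    x_col = width // 2                      # column of x, the middle tile of the bottom row
    rect = []
    for r in range(height):
        up = height - 1 - r                 # number of rows above the bottom row
        row = []
        for c in range(width):
            arch = max(abs(c - x_col), up) + 1   # assumed arch numbering, x is arch 1
            if arch == 1:
                colour = 1
            elif arch == 2:
                colour = 2
            elif arch % 2 == 1:
                colour = 3                  # remaining odd arches
            else:
                # remaining even arches: colour 2, except the tile directly above x
                colour = 1 if c == x_col else 2
            row.append(colour)
        rect.append(row)
    return rect
```

With this numbering there are 2ℓ + 3 arches in total and the outermost arch has colour 3, the same as the background colour of the board.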

Let $B$ be a sufficiently large $n \times n$ board constructed as follows. First colour
the whole board with the colour 3. Then, at the bottom of $B$ starting from the left,
place $2\ell + 3$ copies of $R$ one after another without any overlaps. Finally place the
$k$ diamonds $D_i$ on $B$ such that no two diamonds overlap and no diamond overlaps
any copy of $R$. Figure 5 illustrates a board $B$ with $\ell = 2$ and $k = 10$. Since a
diamond has a radius of at most $2w + 1$ and $\ell \le kw$, $k(4w + 1) + (2kw + 3)(4kw + 5)$
is an upper bound on $n$.

The reason why we place copies of $R$ on the board $B$ is to make sure that
at least $2\ell + 2$ moves are required to flood $B$, even in the absence of diamonds.
To see this, suppose first that we flood fill from the top left square in each move.
From the definition of the arches of $R$, disregarding the diamonds on $B$, a minimal
sequence of moves will consist of $\ell$ 1s or 2s interspersed with a total of $\ell - 1$ 3s,
followed by the three moves 3, 2 and 1, respectively. Note that only one copy of
$R$ on $B$ would be enough to achieve this. However, having several copies of $R$ on
$B$ does not affect the minimum number of moves as all copies will get flooded
simultaneously. The idea with the $2\ell + 3$ copies of $R$ is to make sure that at least
$2\ell + 2$ moves are required to flood $B$ even when we are allowed to choose which
tile to flood fill from in each move. To see this, suppose that we choose to flood
fill from a tile inside one of the copies of $R$. Since there are $2\ell + 3$ copies, similar
reasoning to the end of the proof of Lemma 1 tells us that we will do worse than
$2\ell + 2$ moves.

Fig. 5: A board constructed in the proof of Lemma 2.

We will now argue that the number of moves required to flood $B$ is $2\ell + 2$ if
and only if there is a common supersequence of $s_1, \dots, s_k$ of length at most $\ell$. We
choose to flood fill from the top left tile in each move.

Suppose first that there is a common supersequence $s$ of length $\ell' \le \ell$. Let $s'$
be the string $s$ followed by $\ell - \ell'$ 1s. Let $s''$ be the $(2\ell + 2)$-long string obtained
from $s'$ by inserting a 3 after each character of $s'$ and adding the two additional
characters 2 and 1 to the end. We make $2\ell + 2$ moves by choosing the colours in
the same order as they appear in $s''$. Note that all diamonds are flooded after $2\ell'$
moves, and by the last move we have also flooded every copy of $R$, and hence the
whole board $B$.
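
The construction of the move sequence in this direction of the proof is mechanical, as the following Python sketch shows. The function name and the representation of strings as lists of integers are illustrative assumptions.

```python
def lemma2_move_sequence(s, ell):
    """Build the (2*ell + 2)-long colour sequence from a common supersequence s.

    s is a list over {1, 2} with len(s) <= ell.
    """
    if len(s) > ell:
        raise ValueError("the supersequence must have length at most ell")
    padded = list(s) + [1] * (ell - len(s))   # s followed by (ell - len(s)) 1s
    moves = []
    for ch in padded:
        moves += [ch, 3]                      # insert a 3 after each character
    return moves + [2, 1]                     # two additional moves at the end
```

For example, with the supersequence 1, 2 and a target length of 3, the sketch yields the eight moves 1, 3, 2, 3, 1, 3, 2, 1.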

Suppose second that $B$ can be flooded in $2\ell + 2$ moves. The centre of each
diamond has the colour 3 and therefore the first $2\ell$ moves flood the diamonds.
The subsequence of these first $2\ell$ moves induced by the colours 1 and 2 is an
$\ell$-long common supersequence of $s_1, \dots, s_k$. □

We can now summarise Lemmas 1 and 2 in the following theorem. 

Theorem 1. For $c \ge 3$, c-Flood-It and c-Free-Flood-It are NP-hard (and
the decision versions are NP-complete). Further, for an unbounded number of
colours $c$, there is no polynomial-time constant factor approximation algorithm,
unless P = NP.

For two colours, 2-Flood-It is trivially in P, but it is not that obvious what 
the complexity of 2-Free-Flood-It is. The next theorem settles this question, by 
showing that an optimal strategy for any instance of 2-Free-Flood-It consists 
of flooding from the same tile in each move. 
Fig. 6: A board (a) before and (b) after $m_1 + m_2$ moves as discussed in the proof of
Theorem 2. The solid and dashed paths give $P_1$ and $P_2$ respectively. In left-to-right
order, the emphasised tiles are $t_2$, $t_2'$, $t_1'$ and $t_1$.

Theorem 2. 2-Free-Flood-It is in P. 

Proof. We first consider the case where we are allowed to flood fill from exactly 
two distinct tiles of the board. At the end of the proof we consider the case where 
flooding from any tile is allowed. 

Suppose there exists a shortest sequence of moves $S$ that floods the board
from only two tiles $t_1$ and $t_2$. Suppose also that $t_1$ and $t_2$ belong to different
connected components during the first $m_1 + m_2$ moves but become connected in
the $(m_1 + m_2 + 1)$th move, where $m_1$ is the number of flood filling operations
from $t_1$ and $m_2$ is the number of flood filling operations from $t_2$. Suppose without
loss of generality that in move $m_1 + m_2 + 1$, we flood fill from $t_1$. Let $t_1'$ and
$t_2'$ be two adjacent tiles such that after $m_1 + m_2$ moves, $t_1$ and $t_1'$ belong to the
same monochromatic region, and $t_2$ and $t_2'$ belong to the same monochromatic
region. Let $P_1$ be a simple path from $t_1$ to $t_1'$ in the board with the monochromatic
connected components $\alpha_0, \dots, \alpha_{m_1}$, such that the $i$th flood filling move from $t_1$
merges $\alpha_i$ with the monochromatic region that contains $t_1$. Thus, $t_1 \in \alpha_0$ and
the whole path $P_1$ is monochromatic after $m_1$ flood filling operations from $t_1$. We
define a path $P_2$ from $t_2$ to $t_2'$ similarly. Let $\beta_0, \dots, \beta_{m_2}$ be the monochromatic
connected components of $P_2$. Figure 6 illustrates the two paths $P_1$ and $P_2$.

We now show that the area flooded after the first $m_1 + m_2 + 1$ moves of $S$ can
be flooded with $m_1 + m_2 + 1$ flood filling moves from one single tile $t_3$. Let $P_3$ be
the path $P_1$ concatenated with a reversed copy of $P_2$. Thus, the monochromatic
connected components $\gamma_i$ of $P_3$ are $\gamma_0 = \alpha_0$, $\gamma_1 = \alpha_1$, $\dots$, $\gamma_{m_1} = \alpha_{m_1}$, $\gamma_{m_1+1} =
\beta_{m_2}$, $\gamma_{m_1+2} = \beta_{m_2-1}$, $\dots$, $\gamma_{m_1+m_2+1} = \beta_0$. Let $t_3$ be a tile in $\gamma_{m_2}$ and consider a
series of flood filling moves from this tile: after the first $m_2$ moves, $t_1$ and $t_3$ are
connected, and after the first $m_1 + 1$ moves, $t_2$ and $t_3$ are connected. Once a tile $t$
is in the same monochromatic component as $t_3$, flooding from $t_3$ is equivalent to
flooding from $t$. Thus, after a total of $m_1 + m_2 + 1$ flood filling moves from $t_3$, we
have effectively performed $m_1 + 1$ flood filling moves from $t_1$ and $m_2$ flood filling
moves from $t_2$. This is exactly what the first $m_1 + m_2 + 1$ moves of $S$ do. Hence
we can replace the moves in $S$ by flooding from a single tile $t_3$.

Finally we deal with the case where we are allowed to flood fill from any tile.
Consider a shortest sequence $S$ of moves that flood the board and suppose that
we flood fill from the tiles $t_1, \dots, t_r$, for $r > 2$. Suppose without loss of generality
that the first merge of any of these tiles is when we flood fill from $t_1$, which
connects $t_1$ with $t_2, \dots, t_{r'}$, where $2 \le r' \le r$. Let $m_i$ be the number of flood
filling operations that have taken place from $t_i$ before this merge. The following
sequence $S'$ of moves will flood the board in at most $|S|$ moves but flood fills from
only $r - 1$ tiles. For $i = 3, \dots, r$, first perform $m_i$ flood filling operations from $t_i$.
Instead of flooding from $t_1$ and $t_2$ separately, we use the result above and flood fill
from a different tile $t'$. Thus, by the next $m_1 + m_2 + 1$ moves we have connected
$t_1, \dots, t_{r'}$. The subsequent moves of $S'$ follow those of $S$, where any move at $t_1$ or
$t_2$ is replaced with a move at $t'$. Inductively we reduce the number of tiles to flood
fill from to a single tile. The conclusion is that we can solve 2-Free-Flood-It
by attempting to flood the entire board from each tile of the board in turn, which
requires only polynomial time. □
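
The resulting algorithm can be sketched in Python as follows. Rather than simulating every move, this sketch uses the observation that on a two-coloured board each move from a fixed tile absorbs all regions adjacent to the current one, so the number of moves needed from a tile equals the largest number of colour changes on a best path from it to any other tile. That shortcut, the 0-1 breadth-first search used to compute it, and the function names are choices made for this example and are not taken from the paper.

```python
from collections import deque

def moves_from_tile(board, start):
    """Moves needed to flood a two-coloured board when every move floods from `start`."""
    rows, cols = len(board), len(board[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            # Crossing into a tile of the other colour costs one move.
            step = 0 if board[nr][nc] == board[r][c] else 1
            d = dist[(r, c)] + step
            if (nr, nc) not in dist or d < dist[(nr, nc)]:
                dist[(nr, nc)] = d
                if step == 0:
                    queue.appendleft((nr, nc))
                else:
                    queue.append((nr, nc))
    return max(dist.values())

def solve_two_colour_free_flood_it(board):
    """Try every tile as the single flooding position and keep the best result."""
    tiles = [(r, c) for r in range(len(board)) for c in range(len(board[0]))]
    return min(moves_from_tile(board, t) for t in tiles)
```

On the one-row board 1, 2, 1, 2, 1, flooding from the middle tile takes two moves, whereas flooding from either end takes four.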

4 The complexity of constant height boards 

So far we have analysed the complexity of Flood-It on square shaped $n \times n$
boards. A natural question to ask is: what is the complexity of c-Flood-It on
an $h \times n$ board, where the height $h$ is a fixed constant? We denote this problem
by $(c, h)$-Flood-It and the 'free' variant by $(c, h)$-Free-Flood-It, analogously.

$(c, 1)$-Flood-It is trivially in P, and Fleischer and Woeginger have shown
(personal communication) that $(c, 1)$-Free-Flood-It is also in P. We will show
that $(c, 2)$-Flood-It on a $2 \times n$ board remains in P. However, the complexity
of $(c, 2)$-Free-Flood-It remains unresolved. Before stating this result we will
prove in Theorem 3 that when the number of colours is unbounded and $h \ge 3$
then both $(c, h)$-Flood-It and $(c, h)$-Free-Flood-It are NP-hard.

For the c-Flood-It problem on a square $n \times n$ board with $c \ge 4$ we gave a
reduction from the shortest common supersequence problem (SCS) which embedded
a number of diamond structures into a board filled with a single background
colour. Each diamond represented one of the strings in the SCS instance. The
problem with this reduction on an $h \times n$ board is that a string of length $\ell$ was
represented by a diamond with height $4\ell - 1$. This is not possible if $h < 4\ell - 1$.
However, Timkovskii proved [17] that the SCS problem remains NP-hard even
when the length of the strings is constrained to be at most 2, and the alphabet size
is unbounded. Inspection of the proof of Lemma 1 shows that $(c, h)$-Flood-It is
NP-hard (and the decision version NP-complete) when $h \ge 8$ and the number
of colours is unbounded. Naively, it would appear that $h = 7$ suffices in the proof
of Lemma 1 as it allows enough height to embed a diamond representing a string
of length 2 as is required. However, for the reduction to be valid we also need to
leave at least one row of space above the diamonds so that all diamonds can be
flooded simultaneously on any move.

To reduce the board height required for our NP-hardness proof further we
reduce the height of the diamond structures used in the reduction. Recall that the
reduction begins by doubling the length of all strings in a way that ensures that no
string contains a character which is followed immediately by another occurrence
of the same character. We now show in Lemma 3 that the SCS problem remains
NP-hard even when all the strings are of the form $ab$ where $a, b \in \Sigma$ and $a \ne b$.
The proof is by reduction from the SCS problem with the constraint that all
strings have length at most 2. This result allows us to remove the doubling step
and reduce the height of the diamond structures, resulting in Theorem 3 which
gives the desired result.

Lemma 3. The SCS problem is NP-hard when all the strings are of the form
$ab$ where $a, b \in \Sigma$ and $a \ne b$.

Proof. Let $S$ be an instance of SCS that contains $k$ strings $s_1, \dots, s_k$ each of
length $w \le 2$ over the alphabet $\Sigma$. We abuse notation by referring to $S$ as both
the instance and the set of $k$ strings. We begin by assuming that $S$ contains
a string of length 1. Without loss of generality let $s_k$ be such a string. Let $S'$
be the instance of SCS formed by the $k - 1$ strings $s_1, \dots, s_{k-1}$. There are two
cases to consider. In the first case, the single character, $a$, in $s_k$ occurs in some
string $s_1, \dots, s_{k-1}$. Therefore any common supersequence of $S'$ contains an $a$
and hence is also a common supersequence of $S$. Further, as $S$ is a superset of
$S'$, any common supersequence of $S$ is a common supersequence of $S'$. Hence
$|SCS(S)| = |SCS(S')|$. In the second case, the single character, $a$, in $s_k$ does
not occur in $s_1, \dots, s_{k-1}$. A common supersequence for $S$ can therefore be found
by inserting $a$ at the end of the shortest common supersequence for $S'$. Hence
$|SCS(S)| \le |SCS(S')| + 1$. Further, any common supersequence for $S$ must contain
an $a$ and is also a common supersequence for $S'$. Therefore by removing the $a$ we
have that $|SCS(S)| \ge |SCS(S')| + 1$. Hence in this case $|SCS(S)| = |SCS(S')| + 1$.
Repeated application of the above technique gives a poly-time reduction from the
SCS problem with strings of length $w \le 2$ to the SCS problem with strings of
length $w = 2$. Therefore we have that the latter is also NP-hard.

We now redefine S to be an instance of SCS that contains k strings $s_1, \ldots, s_k$, 
each of length exactly 2, over the alphabet $\Sigma$. We begin by assuming that S 
contains a string of the form aa where $a \in \Sigma$. Without loss of generality let $s_k$ be such 
a string. Let S' be the instance of SCS formed by the k - 1 strings $s_1, \ldots, s_{k-1}$ and 
new strings $s_{k+1} = aa'$ and $s_{k+2} = a'a$, where a' does not occur in S. First consider 
the shortest common supersequence of S, which must contain aa as a subsequence. 
By inserting a' between these two occurrences of a, we obtain a common 
supersequence of S' of length |SCS(S)| + 1. Therefore $|SCS(S')| \leq |SCS(S)| + 1$. Now 
consider the shortest common supersequence of S', which must contain either aa'a 
or a'aa' as a subsequence. In the former case, by removing the a' symbol we obtain 
a common supersequence of S and have that $|SCS(S')| \geq |SCS(S)| + 1$. In the 
latter, when we remove the two occurrences of a' we obtain a common 
supersequence of $s_1, \ldots, s_{k-1}$ of length |SCS(S')| - 2. This sequence contains exactly one 
occurrence of a, and by inserting a second we obtain a common supersequence of 
S of length |SCS(S')| - 1. Therefore |SCS(S')| = |SCS(S)| + 1. Repeated 
application of the above technique gives a poly-time reduction from the SCS problem 
with strings of length 2 to the SCS problem with strings of the form ab where 
$a, b \in \Sigma$ and $a \neq b$. Therefore we have that the latter is also NP-hard. □ 
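
The replacement step in the proof can be checked on small instances. The following is a minimal sketch and not part of the original argument: `scs_length` and `split_repeated` are hypothetical helper names, the search is a plain breadth-first search over the positions reached in each string (exponential in general, so suitable only for tiny inputs), and the caller is assumed to supply a symbol `fresh` that does not occur in the instance.

```python
from collections import deque

def scs_length(strings):
    """Length of a shortest common supersequence, by BFS over per-string positions."""
    start = tuple(0 for _ in strings)
    goal = tuple(len(s) for s in strings)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if state == goal:
            return dist
        # Only a symbol that some string is currently waiting for can make progress.
        for ch in {s[i] for s, i in zip(strings, state) if i < len(s)}:
            nxt = tuple(i + 1 if i < len(s) and s[i] == ch else i
                        for s, i in zip(strings, state))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))

def split_repeated(strings, fresh):
    """Replace the first string of the form aa by aa' and a'a, with a' = fresh (assumed unused)."""
    for idx, s in enumerate(strings):
        if len(s) == 2 and s[0] == s[1]:
            rest = strings[:idx] + strings[idx + 1:]
            return rest + [s[0] + fresh, fresh + s[0]]
    return list(strings)

before = ["aa", "ab"]
after = split_repeated(before, "x")
# Lengths of the shortest common supersequences before and after the replacement.
print(scs_length(before), scs_length(after))
```

On this instance the second length should exceed the first by exactly one, which is the relation |SCS(S')| = |SCS(S)| + 1 established in the proof.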






Fig. 7: An example of a board constructed in the proof of Theorem 3. In left- 
to-right order, the strings embedded are "23", "12", "32" and "21". The shortest 
common supersequence is "2132". 

Theorem 3. (c, h)-Flood-It and (c, h)-Free-Flood-It are NP-hard when $h \geq 3$ 
and the number of colours c is unbounded (and the decision versions are 
NP-complete). 

Proof. First observe that the decision versions of both problems are in NP 
because the unconstrained versions, c-Flood-It and c-Free-Flood-It, are in NP. 
We begin by considering the (c, h)-Flood-It problem for $h \geq 3$. We reduce from 
an instance of SCS on k strings $s_1, \ldots, s_k$ over the alphabet $\Sigma$ and an integer 
$\ell$. The strings are constrained to have the form ab where $a, b \in \Sigma$ and $a \neq b$. 
Let B be an $h \times n$ board filled with a single background colour where n = 4k + 1. 
For each symbol in $\Sigma$ we have a corresponding distinct colour in addition to the 
background colour. For each string $s_i = a_i b_i$ we embed a 'half diamond' against 
the bottom edge of the board. The half diamond consists of a single tile of colour 
$b_i$ (the inner layer), surrounded on all three sides by a tile of colour $a_i$ (the outer 
layer). This is illustrated in Figure 7 for h = 3. 

Observe that as $h \geq 3$ and n = 4k + 1, all the half diamonds can be placed 
so that the outer layer of each half diamond is surrounded by the background 
colour. Therefore on any move, the outer layer of any half diamond can be flooded. 
Further observe that for all i, as $a_i \neq b_i$ the half diamond for $s_i$ is flooded if and only 
if the move sequence contains $a_i b_i$ as a subsequence. Therefore a move sequence 
floods the board if and only if it is a common supersequence of $s_1, \ldots, s_k$, so 
the length of a shortest common supersequence of these strings equals the length 
of the shortest move sequence which floods the board. 
As this reduction can be implemented in polynomial time, we have that the 
(c, h)-Flood-It problem is NP-hard with an unbounded number of colours. 

We now consider the (c, h)-Free-Flood-It problem for $h \geq 3$. NP-hardness 
follows by the same argument as for the NP-hardness of c-Free-Flood-It for 
$c \geq 4$ given in the proof of Lemma 1. We increase the size of the board (horizontally) 
and embed 2k + 1 copies of each half diamond. We observe that any flood-filling 
move begun from a tile in an unflooded half diamond floods only tiles in that 
half diamond. This ensures that any move sequence which floods the board and 
contains moves begun from tiles in an unflooded half diamond either contains at 
least 2k + 1 moves or contains redundant moves. In either case, it is not minimal. 

□ 
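
The Flood-It half of the reduction can be simulated directly. The sketch below is illustrative only: it assumes h = 3, encodes symbols as positive integers with 0 as the background colour, and places the half diamond for the ith string in columns 4i + 1 to 4i + 3. The names `build_board` and `floods` are hypothetical. It uses the four strings of Figure 7.

```python
from collections import deque

def region(board):
    """Tiles in the monochromatic region connected to the top-left tile."""
    rows, cols = len(board), len(board[0])
    colour, seen, queue = board[0][0], {(0, 0)}, deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and board[nr][nc] == colour):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def move(board, colour):
    """One flooding operation from the top-left tile (in place)."""
    for r, c in region(board):
        board[r][c] = colour

def build_board(strings, background=0):
    """3 x (4k + 1) board with one half diamond per string (a_i, b_i), a_i != b_i."""
    board = [[background] * (4 * len(strings) + 1) for _ in range(3)]
    for i, (a, b) in enumerate(strings):
        col = 4 * i + 2           # column of the inner tile
        board[2][col] = b         # inner layer, on the bottom edge
        board[1][col] = a         # outer layer: above
        board[2][col - 1] = a     # outer layer: left
        board[2][col + 1] = a     # outer layer: right
    return board

def floods(board, moves):
    """True if playing `moves` in order leaves the whole board flooded."""
    board = [row[:] for row in board]
    for colour in moves:
        move(board, colour)
    return len(region(board)) == len(board) * len(board[0])

board = build_board([(2, 3), (1, 2), (3, 2), (2, 1)])
print(floods(board, [2, 1, 3, 2]), floods(board, [2, 1, 3]))
```

The sequence 2, 1, 3, 2 is a common supersequence of all four strings, whereas 2, 1, 3 is not, so only the first should flood the board.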

We finally show that (c, 2)-Flood-It is in P. 

Theorem 4. For any $c \geq 1$, c-Flood-It on a $2 \times n$ board is in P. More precisely, 
the running time is O(n). 

Proof. Suppose that B is a $2 \times n$ board and c is the number of colours. We say 
that a tile t on B is marked if it has colour q and no other tile in the columns 
strictly to the right of t has the colour q. A column is marked if it contains a 
marked tile. 

The key observation, which holds on a 2 x n board, is that if the marked tiles 
are flooded then so is the whole board B. To see this, note that when a marked 
tile t of colour q is flooded, all other tiles of the colour q that have not yet been 
flooded are to the left of t and therefore adjacent to the flooded region. Hence 
they will be flooded when t is flooded. Thus, we ask for the shortest sequence of 
moves that would flood the marked tiles. 

A shortest path to a tile t denotes a shortest sequence of flood filling operations 
that includes t in the flooded region. If t is already included in the flooded region, 
then the length of the shortest path to t is 0. 

One might think that a solution to c-Flood-It on a 2 x n board would be to 
go from one marked tile to the next in left-to-right order using shortest paths. 
Although this is correct, we must be a little careful with which shortest paths we 
choose. The following procedure floods the marked tiles in the smallest number 
of moves possible. 

Beginning of procedure. Let i be the leftmost marked column such that i 
contains a marked tile t that has not yet been flooded. Let t' be the other tile 
in column i. We have two cases. 

Case 1 (t' is unmarked). Let m and m' be the lengths of the shortest 
paths to t and t', respectively. Note that $|m - m'| \leq 1$. We consider two 
subcases. 

Case 1a ($m \leq m'$). Flood using the sequence of colours found along 
the shortest path to t, then go to the beginning of the procedure. 
Correctness: Flooding t' before t means that we are bound to flood t 
at a later stage. Once t' is flooded we can never do worse by flooding 
t immediately. Thus, flooding t' before t and then flooding t takes a 
total of at least m + 1 moves. However, flooding t takes m moves and 
we are not necessarily forced to spend an extra move on flooding t', 
which is not a marked tile. 

Case 1b (m > m'). Flood using the sequence of colours found along the 
shortest path to t' and then flood t. Then go to the beginning of the 
procedure. Correctness: Flooding t takes at least m' + 1 steps, even if 
we do not go via t'. Since all remaining marked tiles are to the right of 
column i, we should therefore flood t' before t. Once t' is flooded, we 
can never do worse by flooding t immediately. 

Case 2 (t' is marked). Flood using the sequence of colours found along 
the shortest of the shortest paths to t or t'. Then flood the remaining tile 
in column i. Then go to the beginning of the procedure. Correctness: Both 
t and t' must eventually be flooded. Once one of them is flooded, there is 
no reason to wait to flood the other. 

Using for example dynamic programming, the shortest path to a tile t on a 
$2 \times n$ board can be computed in time linear in the distance between the flooded 
region and t. We note that the shortest paths are always calculated between the 
rightmost end of the flooded region and a marked column i. Since the flooded 
region is always extended to column i in each step of the procedure, the total 
running time of computing the shortest paths is linear in n. Hence the running 
time of the whole process is O(n). □ 
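
The key observation behind the procedure (once every marked tile is flooded, the whole $2 \times n$ board is flooded) can be checked by simulation. The sketch below does not implement the linear-time procedure itself. It only computes the marked tiles of a board and then plays arbitrary moves until all of them are flooded. The helper names and the use of randomly chosen moves are assumptions made for illustration.

```python
import random
from collections import deque

def region(board):
    """Tiles in the monochromatic region connected to the top-left tile."""
    rows, cols = len(board), len(board[0])
    colour, seen, queue = board[0][0], {(0, 0)}, deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and board[nr][nc] == colour):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def move(board, colour):
    """One flooding operation from the top-left tile (in place)."""
    for r, c in region(board):
        board[r][c] = colour

def marked_tiles(board):
    """Tiles whose colour does not occur in any column strictly to their right."""
    rows, cols = len(board), len(board[0])
    seen_right, marked = set(), set()
    for c in range(cols - 1, -1, -1):
        for r in range(rows):
            if board[r][c] not in seen_right:
                marked.add((r, c))
        seen_right.update(board[r][c] for r in range(rows))
    return marked

rng = random.Random(1)
board = [[rng.randrange(4) for _ in range(8)] for _ in range(2)]
marked = marked_tiles(board)          # computed on the initial colouring
while not marked <= region(board):    # play until every marked tile is flooded
    move(board, rng.randrange(4))
print(len(region(board)) == 2 * 8)    # the observation says the board is now flooded
```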



5 Approximating the number of moves 

As we have seen, c-Flood-It and c-Free-Flood-It are not efficiently 
approximable to within a constant factor for an unbounded number of colours c. 
However, a (c - 1)-approximation for c-Flood-It, $c \geq 3$, can easily be obtained as 
follows. Suppose that B is a board on the colours $1, \ldots, c$. Clearly, if we 
repeatedly cycle through the sequence of colours $1, \ldots, c$ then B will be flooded after 
at most $c \times m(B)$ moves. We can do a little better by first cycling through the 
ordered sequence of colours $1, \ldots, c$ and then repeatedly alternating between a 
cycle of the sequence $(c - 1), \ldots, 1$ and a cycle of $2, \ldots, c$ until there are only 
two distinct colours left on the board, after which we alternate between the two 
remaining colours. Note that there are always exactly two distinct colours left 
before the final move. The board B is guaranteed to be flooded after at most 
$c + (c - 1)(m(B) - 2) + 1 \leq (c - 1)m(B)$ moves, which gives us a 
(c - 1)-approximation algorithm. 
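
The baseline claim, that plain cycling through all c colours needs at most $c \times m(B)$ moves, can be compared against the true optimum on small boards. This sketch covers only that baseline and not the refined alternating scheme. `min_moves` is a brute-force breadth-first search introduced here for illustration and is practical only for very small boards. Colours are assumed to be numbered from 0.

```python
import random
from collections import deque

def region(board):
    """Tiles in the monochromatic region connected to the top-left tile."""
    rows, cols = len(board), len(board[0])
    colour, seen, queue = board[0][0], {(0, 0)}, deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and board[nr][nc] == colour):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def move(board, colour):
    """One flooding operation from the top-left tile (in place)."""
    for r, c in region(board):
        board[r][c] = colour

def min_moves(board, colours):
    """m(B) by breadth-first search over reachable boards (tiny boards only)."""
    start = tuple(map(tuple, board))
    full = len(board) * len(board[0])
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if len(region([list(row) for row in state])) == full:
            return dist
        for k in colours:
            nxt = [list(row) for row in state]
            move(nxt, k)
            key = tuple(map(tuple, nxt))
            if key not in seen:
                seen.add(key)
                queue.append((key, dist + 1))

def cycling_moves(board, colours):
    """Number of moves used when cycling through `colours` until the board is flooded."""
    board = [row[:] for row in board]
    full = len(board) * len(board[0])
    count = 0
    while len(region(board)) < full:
        move(board, colours[count % len(colours)])
        count += 1
    return count

rng = random.Random(0)
board = [[rng.randrange(3) for _ in range(4)] for _ in range(4)]
print(cycling_moves(board, [0, 1, 2]), min_moves(board, [0, 1, 2]))
```

The count includes moves that merge nothing, which is why the guarantee is stated per full cycle of c colours.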

A randomised approach with an expected number of moves of approximately 
$(2c/3) \times m(B)$ is obtained as follows. Suppose that s is a minimal sequence of colours 
that floods B (flood filling from the top left square in each move). We shuffle the 
c colours and process them one by one. If B is not flooded then we shuffle again 
and repeat. Note that this procedure could (and most likely will) generate many 
useless moves that do not merge any monochromatic regions. Thus, if m(B) = 1 
then the algorithm could take up to c moves, although a single move would suffice. 
If m(B) = 2 then $c + \frac{1}{2}c = 3c/2$ is an upper bound on the expected number of 
moves; with probability 1/2, the two moves in s appear in the same order as in the 
shuffled sequence of colours, and if not, we might have to shuffle the colours again 
and repeat one last time. We generalise this as follows. Let T(m) be (an upper 
bound on) the expected number of moves it takes to produce a fixed sequence of 
m moves. We have $T(m) = c + \frac{1}{2}T(m - 1) + \frac{1}{2}T(m - 2)$. Solving the recurrence 
with the values of T(1) and T(2) above gives us a solution in which T(m) is 
asymptotically (2c/3)m for a fixed c. 
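
The asymptotic claim can be checked numerically by iterating the recurrence from T(1) = c and T(2) = 3c/2. The sketch below also compares against a closed form, $T(m) = (2c/3)m + 2c/9 - (2c/9)(-1/2)^m$. That closed form is our own solution of the recurrence, included as a cross-check, and is not stated in the text. The function names are hypothetical.

```python
def expected_moves_bound(m, c):
    """Iterate T(m) = c + T(m-1)/2 + T(m-2)/2 with T(1) = c and T(2) = 3c/2."""
    if m == 1:
        return float(c)
    older, newer = float(c), 1.5 * c
    for _ in range(m - 2):
        older, newer = newer, c + 0.5 * newer + 0.5 * older
    return newer

def closed_form(m, c):
    """Candidate closed form for the same recurrence (derived here, not in the text)."""
    return (2 * c / 3) * m + (2 * c / 9) * (1 - (-0.5) ** m)

c = 6
for m in (1, 2, 3, 10, 1000):
    # The last column approaches 2c/3 as m grows.
    print(m, expected_moves_bound(m, c), expected_moves_bound(m, c) / m)
```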

6 General bounds on the number of moves 

Recall that we denote the minimum number of moves which flood some board B 
as m(B). In this section we investigate bounds on the maximum m(B) over all 
boards in $\mathcal{B}_{n,c}$, which we denote $\max\{m(B) \mid B \in \mathcal{B}_{n,c}\}$. Intuitively, this can be 
seen as the minimum number of moves to flood the 'worst' board in $\mathcal{B}_{n,c}$. 

For motivation, consider an $n \times n$ checker board of two colours as shown in 
Figure 8. First observe that as the board has only two colours, the player has no 
choice in their next move. Consider a diagonal of tiles in the direction top-right 
to bottom-left where the 0th diagonal is the top-left corner. Further observe that 
move k floods exactly the kth diagonal, so the total number of moves is 2(n - 1). 
Thus we have shown that $\max\{m(B) \mid B \in \mathcal{B}_{n,c}\} \geq 2(n - 1)$. 

Fig. 8: Progression of a $6 \times 6$ checker board. 



We now give an overview of a simple algorithm which floods any board in 
$\mathcal{B}_{n,c}$ in at most c(n - 1) moves. The algorithm performs n stages. The purpose 
of the ith stage is to flood the ith row. Stage i repeatedly picks the colour of the 
leftmost tile in row i which is not in the flooded region, until row i is flooded. 

First observe that Stage 1 performs at most n - 1 moves to flood row 1 (we 
can flood at least one tile of row 1 per move). When the algorithm begins Stage 
$i \geq 2$, observe that row i - 1 is entirely flooded as well as any tiles in row i 
which match the colour of row i - 1. Therefore when a new colour is selected, 
all tiles in row i of this colour become flooded. Hence at most c - 1 moves are 
performed by Stage i. Summing over all rows, this gives the desired bound that 
$\max\{m(B) \mid B \in \mathcal{B}_{n,c}\} \leq c(n - 1)$. Observe that from the previous example with 
the checker board on c = 2 colours, the bound c(n - 1) is tight. Thus, the checker 
board is the 'worst' board in $\mathcal{B}_{n,2}$. 
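
The row-by-row algorithm is short enough to sketch in full. The version below is a straightforward, unoptimised rendering that recomputes the flooded region before every move. The function name `row_by_row` and the 0-based colour numbering are assumptions of this example.

```python
import random
from collections import deque

def region(board):
    """Tiles in the monochromatic region connected to the top-left tile."""
    n = len(board)
    colour, seen, queue = board[0][0], {(0, 0)}, deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < n and 0 <= nc < n
                    and (nr, nc) not in seen and board[nr][nc] == colour):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def row_by_row(board):
    """Return the move sequence chosen by the stage-per-row algorithm."""
    board = [row[:] for row in board]
    n = len(board)
    moves = []
    for i in range(n):                       # Stage i floods row i
        while True:
            flooded = region(board)
            pending = [j for j in range(n) if (i, j) not in flooded]
            if not pending:
                break
            colour = board[i][pending[0]]    # leftmost unflooded tile in row i
            for r, c in flooded:
                board[r][c] = colour
            moves.append(colour)
    return moves

checker = [[(r + c) % 2 for c in range(6)] for r in range(6)]
print(len(row_by_row(checker)))              # moves used on the 6 x 6 checker board

rng = random.Random(0)
random_board = [[rng.randrange(4) for _ in range(7)] for _ in range(7)]
print(len(row_by_row(random_board)), "<=", 4 * (7 - 1))
```

On the checker board every move is forced, so the algorithm uses exactly the 2(n - 1) moves of the motivating example.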

As motivation, we have given weak bounds on $\max\{m(B) \mid B \in \mathcal{B}_{n,c}\}$. We now 
tighten these bounds for large c by providing a better algorithm for flooding an 
arbitrary board. We will also give a description of 'bad' boards which require many 
moves to be flooded. It will turn out that $\max\{m(B) \mid B \in \mathcal{B}_{n,c}\}$ is asymptotically 
$\Theta(\sqrt{c}\, n)$ for increasing n and c. 

Theorem 5. There exists a polynomial time algorithm for Flood-It which can 
flood any $n \times n$ board with c colours in at most $2n + (\sqrt{2c})n + c$ moves. 

Proof. For a given integer $\ell$ (to be determined later), we partition the board 
horizontally into $\ell + 1$ contiguous sections, denoted $S_0, \ldots, S_\ell$ from top to bottom, 
as follows. Let $q = \lfloor n/\ell \rfloor$ and $r = n \bmod \ell$. Section $S_0$ consists of the first $\lceil q/2 \rceil$ 
rows, $S_1, \ldots, S_r$ contain q + 1 rows each (if r > 0), and $S_{r+1}, \ldots, S_{\ell-1}$ contain 
q rows each (if $r < \ell - 1$). Section $S_\ell$ contains $\lfloor q/2 \rfloor$ rows. See Figure 9 for an 
illustration. We let y(i) denote the final row of $S_i$. 
The algorithm performs the following three stages. 

Stage 1. Flood the first column. 

Stage 2. Flood row y(x) for all $0 \leq x < \ell$. 

Stage 3. Cycle through the c colours until the board is flooded. 

The correctness of our algorithm is immediate as Stage 3 ensures that the 
board is flooded by cycling colours. Stage 1 can be implemented to perform at 
most n - 1 moves as argued for the simple algorithm above. Similarly, Stage 2 
can be completed in $\ell(n - 1)$ moves. We now analyse Stage 3. 

First consider $S_0$. At the start of Stage 3, row y(0) is entirely in the top-left 
region, so a single cycle of the c colours suffices to expand the region to include 
row y(0) - 1. Each subsequent cycle of c colours expands the region to include an 
additional row. Therefore, after $c(\lceil q/2 \rceil - 1) \leq cq/2$ moves of Stage 3, all rows 
above y(0) are included in the top-left region. Similarly, the section $S_\ell$ will be 
included in the top-left region as it contains $\lfloor q/2 \rfloor \leq q/2$ rows. 

Fig. 9: The board decomposition used in the proof of Theorem 5. 

Fig. 10: 4-diamonds packed in a $20 \times 20$ board. 

Now consider section $S_i$ for some $0 < i < \ell$. Observe that there are at most q 
rows in $S_i$ which are not already completely in the top-left region (after Stage 2). 
Further observe that any cycle of c colours expands the region to include two more 
of these rows. One row is gained from the region bordering the top of the section 
(which is in the top-left region from Stage 2). The second is gained from the region 
bordering the bottom of the section (which is also in the top-left region from Stage 
2). Therefore after at most $c\lceil q/2 \rceil$ moves of Stage 3 the board is flooded. 

Over all three stages this gives a total of at most $n + \ell n + c\lceil q/2 \rceil$ moves. We 
pick $\ell = \lceil \sqrt{c/2} \rceil$ to minimise this number of moves. By recalling that $q = \lfloor n/\ell \rfloor$ 
and simplifying we have that this total is less than $2n + \sqrt{2c}\, n + c$ moves as 
required. □ 
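
The simplification in the final step can be spelled out as follows. This is one possible route and not necessarily the authors' own. It uses only the choice of the number of sections $\ell$ as $\lceil \sqrt{c/2} \rceil$ (so that $\sqrt{c/2} \leq \ell < \sqrt{c/2} + 1$), the definition $q = \lfloor n/\ell \rfloor \leq n/\ell$, and the loose estimate $\lceil q/2 \rceil \leq q/2 + 1$.

$$
\begin{aligned}
n + \ell n + c\lceil q/2 \rceil
&< n + \left(\sqrt{c/2} + 1\right) n + c\left(q/2 + 1\right) \\
&\leq 2n + \sqrt{c/2}\, n + \frac{cn}{2\ell} + c \\
&\leq 2n + \sqrt{c/2}\, n + \frac{cn}{2\sqrt{c/2}} + c \\
&= 2n + \sqrt{c/2}\, n + \sqrt{c/2}\, n + c \;=\; 2n + \sqrt{2c}\, n + c .
\end{aligned}
$$

The last line uses $2\sqrt{c/2} = \sqrt{2c}$, so that $c/(2\sqrt{c/2}) = c/\sqrt{2c} = \sqrt{c/2}$.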

Theorem 6. For $2 \leq c \leq n^2$, there exists an $n \times n$ board with (up to) c colours 
which requires at least $\sqrt{c - 1}\, n/2 - c/2$ moves to flood. 

Proof. Suppose first that c is even. For a given integer $r \geq 1$, let $D_{(x,y)}$ be an 
r-diamond where odd layers are coloured x and even layers are coloured y. Any 
board containing $D_{(x,y)}$ requires at least r moves of colours x and y. Further, 
observe that as long as the centre of $D_{(x,y)}$ is in the board, even if it is cropped 
by at most two edges of the board, at least r moves of colours x and y are still 
required (see Figure 2b). We refer to such an r-diamond as good. The central 
idea is to populate the board with good r-diamonds, $D_{(1,2)}, D_{(3,4)}, \ldots, D_{(c-1,c)}$. 
As each r-diamond uses two colours (or one of the two colours if r = 1) which 
do not occur in any other diamond, the board must take at least rc/2 moves to 
flood. 

It is not difficult to show that at least $(n^2 - r^2)/(2r^2)$ good r-diamonds can 
be embedded in an $n \times n$ board. An example of such a packing for a $20 \times 20$ board 
is given in Figure 10 (which shows only the edges of diamonds and not their 
colouring). This scheme generalises well to an $n \times n$ board but the details are 
omitted in the interest of brevity. 

We now take $r = \lfloor n/\sqrt{c} \rfloor < n/2$ and note that $r \geq 1$. As r < n/2, the 
r-diamonds are cropped by at most two board edges as required. Therefore we have 
at least $(n^2 - r^2)/(2r^2) \geq c/2 - 1/2$ good r-diamonds in our board. However, as the 
number of good r-diamonds is an integer, this is at least c/2 as required. Therefore, 
the number of moves required to flood this board is at least $rc/2 > n\sqrt{c}/2 - c/2$. 

Finally, in the case that c is odd we proceed as above using c — 1 of the colours 
to give the stated result. □ 

The next corollary is immediate from Theorems 5 and 6. 

Corollary 1. $(\sqrt{c - 1}\, n - c)/2 \leq \max\{m(B) \mid B \in \mathcal{B}_{n,c}\} \leq 2n + \sqrt{2c}\, n + c$. 

7 Random boards 

In this section, we try to understand the complexity of a random Flood-It board - 
that is, a board where each tile is coloured uniformly at random. This question is 
of both theoretical and practical interest. A common initialisation for Flood-It is 
to pick the colours of tiles at random and the game designer will surely be keen to 
know if they are likely to have chosen an instance whose solution is trivially short. 
The option of having to solve every created instance to test for this possibility 
is also likely to be unattractive, especially given the complexity results shown in 
this paper. Intuitively, one would expect random boards to usually require a large 
number of moves to flood. Determining how many moves are actually needed 
turns out to be closely related to a body of research in percolation theory, the 
study of connected clusters in random graphs. 

Indeed, a problem in percolation theory that is essentially equivalent to the 
question of the number of moves required for a random Flood-It board has been 
solved quite recently by Chayes and Winfield [3], and independently Fontes and 
Newman [7]. In our terminology, their result was that a random $n \times n$ Flood-It 
board with $c \geq 2$ colours requires $\Omega(n)$ moves with high probability. The proofs 
are lengthy and use some deep previous results in percolation theory. 

We now present a greatly simplified proof of the results of [3,7], in the case 
that $c \geq 3$. Formally, our result is as follows. 

Theorem 7. Let $B \in \mathcal{B}_{n,c}$ be a board where the colour of each tile is chosen 
uniformly at random from $\{1, \ldots, c\}$. Then, for $c \geq 4$, 
$\Pr[m(B) \leq 2(3/10 - 1/c)(n - 1)] < e^{-\Omega(n)}$. For c = 3, $\Pr[m(B) \leq (n - 1)/22] < e^{-\Omega(n)}$. 

In order to prove this theorem, we will use two lemmas concerning paths in 
Flood-It boards. Let P be a simple path in a Flood-It board, i.e. a simple path on 
the underlying square lattice¹, where tiles are vertices on the path. Note that a 
path of length k includes k + 1 tiles. We say that a simple path P is non-touching if 
every tile in P is adjacent to at most two tiles that are also in P. Define the cost of 
P, cost(P), to be the number of maximal monochromatic connected components 
of the path, minus one (so a monochromatic path has cost 0). 

¹ Simple paths on square lattices have been intensively studied, and are known as self-avoiding 
walks [12]. There are known upper bounds, which are slightly stronger than Lemma 5, on 
the number of self-avoiding walks of a given length; however, we avoid these here to keep our 
presentation elementary. 

Lemma 4. For any $B \in \mathcal{B}_{n,c}$, there is a non-touching path from (1, 1) to (n, n) 
with cost at most m(B). 

Proof. For m(B) = 0 there is nothing to prove, so consider a strategy for 
completing B which uses m(B) > 0 moves. Label every tile $t \in B$ with an integer 
m(t) between 0 and m(B) that indicates the number of the move which changed 
the colour of t to be the colour of tile (1, 1). Then, for each $i \geq 1$, there is a 
connected component labelled with i which has at least one neighbour labelled 
with i - 1. As the label of (n, n) is at most m(B), and the label of (1, 1) is 0, there 
is a simple path from (1, 1) to (n, n) with cost at most m(B). This path can be 
taken to be non-touching, because any pair of adjacent tiles $(t_1, t_2)$ that are on 
the path but not connected by it correspond to a loop in the path that can be 
removed without increasing the cost. □ 

Lemma 5. For any integer $\ell \geq 3$, there are at most $4 \cdot 7^{(\ell-1)/2} < 2 \cdot (\sqrt{7})^{\ell}$ 
non-touching paths of length $\ell$ from any given tile. 

Proof. Let $T(\ell)$ denote the maximum number of non-touching paths of length 
$\ell$ starting from any given tile. $T(\ell)$ can be straightforwardly upper bounded by 
$4 \cdot 3^{\ell-1}$ for $\ell \geq 1$, as with each step of the path, aside from the first, there are at 
most 3 choices of direction. We get a tighter bound by analysing a few steps on 
a non-touching path P. Consider the ith vertex on P, for some $i \geq 2$. As P is 
simple, there are at most 3 choices for the (i + 1)th vertex of the path. For vertex 
i + 2, if the previous two steps were in the same direction, there are at most 3 
more choices. On the other hand, if the previous two were in different directions, 
there are only at most 2 choices (otherwise, the path would go back on itself, and 
would not be non-touching). In total, there are only at most 7 possible options 
for vertices i + 1, i + 2. Therefore, for any $\ell \geq 3$, we have $T(\ell) \leq 4 \cdot 7^{(\ell-1)/2}$. □ 
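
For short paths the count can be compared with the bound by exhaustive enumeration. The sketch below makes one assumption that should be stated plainly: it reads 'non-touching' in the sense used in the proof of Lemma 4, namely that two tiles of the path are adjacent on the board only if they are consecutive on the path. It also enumerates on the unbounded square lattice, which can only overcount relative to a finite board. The function name is hypothetical.

```python
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def count_non_touching_paths(length):
    """Count paths with `length` edges from the origin in which only consecutive tiles are adjacent."""
    def extend(path, visited):
        if len(path) == length + 1:
            return 1
        x, y = path[-1]
        total = 0
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt in visited:
                continue
            # The new tile may be adjacent to the current end of the path and nothing else.
            touching = sum((nxt[0] + ex, nxt[1] + ey) in visited for ex, ey in STEPS)
            if touching != 1:
                continue
            path.append(nxt)
            visited.add(nxt)
            total += extend(path, visited)
            path.pop()
            visited.remove(nxt)
        return total
    return extend([(0, 0)], {(0, 0)})

for length in range(3, 9):
    bound = 4 * 7 ** ((length - 1) / 2)
    print(length, count_non_touching_paths(length), round(bound, 1))
```

Under this reading the bound is attained with equality at length 3 and becomes increasingly slack for longer paths.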

The last result we will need is the following Chernoff-Hoeffding bound. 

Fact 1 (Hoeffding [8]). Let $X_i$, $1 \leq i \leq m$, be independent 0/1-valued random 
variables with $\Pr[X_i = 1] = p$. Then, for any $p \leq x \leq 1$, 

$$\Pr\left[\sum_{i=1}^{m} X_i \geq xm\right] \leq e^{-D(x\|p)\,m},$$

where $D(x\|y)$ is the Kullback-Leibler divergence 
$D(x\|y) = x\ln(x/y) + (1 - x)\ln((1 - x)/(1 - y))$, which satisfies $D(x\|y) \geq 2(x - y)^2$. 

We are finally ready to prove Theorem 7. 

Proof (of Theorem 7). For any $k \geq 0$, and for any board B such that $m(B) \leq k$, 
by Lemma 4 there exists a non-touching path from (1, 1) to (n, n) with cost at 
most k. So consider an arbitrary non-touching path P in B of length $\ell$ between 
these two tiles, and let $P_i$ denote the ith tile on the path, for $1 \leq i \leq \ell + 1$. 
Note that $\ell \geq 2(n - 1)$. Then $\mathrm{cost}(P) = |\{i : P_{i+1} \neq P_i\}|$, or equivalently 
$\mathrm{cost}(P) = \ell - |\{i : P_{i+1} = P_i\}|$, where tiles are compared by colour. Define the 
0/1-valued random variable $X_i$ by $X_i = 1 \Leftrightarrow P_{i+1} = P_i$. Then, as the colours of 
tiles are uniformly distributed, $\Pr[X_i = 1] = 1/c$ for all i, and 

$$\Pr[\mathrm{cost}(P) \leq k] = \Pr\left[\sum_{i=1}^{\ell} X_i \geq \ell - k\right] \leq e^{-D(1 - k/\ell\,\|\,1/c)\,\ell},$$

where we use Fact 1. Thus, using the union bound over all paths of length at least 
2(n - 1) from (1, 1) to (n, n), we get that the probability that there exists any 
path of cost at most k is upper bounded by 

$$2\sum_{\ell=2(n-1)}^{\infty} (\sqrt{7})^{\ell}\, e^{-D(1 - k/\ell\,\|\,1/c)\,\ell} = 2\sum_{\ell=2(n-1)}^{\infty} e^{\left((1/2)\ln 7 - D(1 - k/\ell\,\|\,1/c)\right)\ell}, \qquad (1)$$

where we use the estimate for the number of paths which was derived in Lemma 5. 
In the final part of the proof, we consider the cases c ^ 4 and c = 3 separately. 

First suppose that $c \geq 4$. We take $k = 2(3/10 - 1/c)(n - 1) \leq (3/10 - 1/c)\ell$, 
as in the statement of the theorem, and use $D(1 - k/\ell\,\|\,1/c) \geq 2(1 - k/\ell - 1/c)^2$ 
(from Fact 1) to obtain the bound 

$$2\sum_{\ell=2(n-1)}^{\infty} e^{\left((1/2)\ln 7 - 2(1 - k/\ell - 1/c)^2\right)\ell} \leq 2\sum_{\ell=2(n-1)}^{\infty} e^{\left((1/2)\ln 7 - 49/50\right)\ell}.$$

As $49/50 > (1/2)\ln 7 \approx 0.973$, this sum is exponentially small in n. 

Lastly, suppose that c = 3. In this case, our choice of k above is negative. 
Instead we take k = (n - 1)/22, which implies $1 - k/\ell \geq 43/44$. In order to 
obtain a sufficiently tight bound on $D(1 - k/\ell\,\|\,1/c)$, we use the explicit formula 
in Fact 1 to show that $D(43/44\,\|\,1/3) > 0.974 > (1/2)\ln 7$, which implies that 
there is a bound in Equation (1) which is exponentially small in n. This completes 
the proof. □ 
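
The numerical constants used in the two cases are easy to verify. The snippet below is only a sanity check of the arithmetic, using the Kullback-Leibler divergence for 0/1 variables, $D(x\|y) = x\ln(x/y) + (1 - x)\ln((1 - x)/(1 - y))$. The function name `kl` is ours.

```python
import math

def kl(x, y):
    """Kullback-Leibler divergence D(x || y) between two 0/1 distributions, 0 < x, y < 1."""
    return x * math.log(x / y) + (1 - x) * math.log((1 - x) / (1 - y))

half_ln7 = 0.5 * math.log(7)

# Case c >= 4: 1 - k/l - 1/c >= 7/10, so the exponent constant is 2 * (7/10)^2 = 49/50.
print(2 * 0.7 ** 2, half_ln7, 49 / 50 > half_ln7)

# Case c = 3: the explicit divergence at x = 43/44, y = 1/3.
print(kl(43 / 44, 1 / 3), kl(43 / 44, 1 / 3) > 0.974 > half_ln7)
```

Both margins are small (roughly 0.007 and 0.001), which is why the case c = 3 needs the explicit divergence in place of the quadratic lower bound.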



8 Conclusion and open problems 

We have shown that, for three or more colours, Flood-It is NP-hard. However, 
for two colours, the relaxed version of the problem termed Free-Flood-It in 
which we are allowed to flood fill from any tile of the board remains in P. Some 
interesting open questions remain. First, the complexity of solving 
Free-Flood-It on a height 2 board remains unresolved. Second, we conjecture that the true 
lower bound for random boards is $\Omega(\sqrt{c}\, n)$, matching the general upper bound. 
Interestingly, the percolation theory techniques that we are aware of do not appear 
to allow for super-linear lower bounds of the sort that would be required. 